Abstract

Three situations in which filtering theory is used in mathematical finance are il- 
lustrated at different levels of detail. The three problems originate from the following 
different works: 

1) On estimating the stochastic volatility model from observed bilateral exchange 
rate news, by R. Mahieu, and P. Schotman; 

2) A state space approach to estimate multi-factors CIR models of the term struc- 
ture of interest rates, by A.L.J. Geyer, and S. Pichler; 

3) Risk-minimizing hedging strategies under partial observation in pricing finan- 
cial derivatives, by P. Fischer, E. Platen, and W. J. Runggaldier; 

In the first problem we propose to use a recent nonlinear filtering technique based 
on geometry to estimate the volatility time series from observed bilateral exchange 
rates. The model used here is the stochastic volatility model. The filters that we 
propose are known as projection filters, and a brief derivation of such filters is given. 
The second problem is introduced in detail, and a possible use of different filtering 
techniques is hinted at. In fact the filters used for this problem in 2) and part of the 
literature can be interpreted as projection filters and we will make some remarks on 
how more general and possibly more suitable projection filters can be constructed. 
The third problem is presented only briefly.

*This work was developed while the first named author was working at the Risk Management
department of Cariplo Bank. A related paper appeared later in: Insurance: Mathematics and
Economics, 22(1) (1998) pp. 53-64.
1 Introduction

The filtering problem consists of estimating a stochastic process $X_t$ representing an
unobserved signal, on the basis of the past and present observations $\{Y_s : 0 \le s \le t\}$
of a related measurement process $Y$. The information given by the measurement process
up to time $t$ is represented by the $\sigma$-algebra $\mathcal{Y}_t$ generated by
$\{Y_s : 0 \le s \le t\}$. For a quick introduction to the filtering problem see Davis and
Marcus (1981) [10]. For a more complete treatment see Liptser and Shiryayev (1978) [21] from a
mathematical point of view or Jazwinski (1970) [19] for a more applied perspective. The solution
of the filtering problem is the conditional density $p_{X_t|\mathcal{Y}_t}$ of the signal $X_t$
given the observations $\mathcal{Y}_t$. In general this solution takes its values in an infinite
dimensional function space in an essential way, as proven in Chaleyat-Maurel and Michel (1984) [9].
As a consequence, in general the filter cannot be implemented by an algorithm which updates only
a finite number of parameters, so that no finite-memory computer implementation is possible.
An important exception is the linear-Gaussian case, where the solution $p_{X_t|\mathcal{Y}_t}$
is Gaussian at all time instants, and as such can be parameterized by mean and variance. This is
the well known Kalman filter.
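
To make the linear-Gaussian exception concrete, the following minimal sketch performs one update of the mean and variance for a scalar model. The model $X_{t+1} = aX_t + N(0,q)$, $Y_{t+1} = cX_{t+1} + N(0,r)$, the function name `kalman_step` and all parameter names are assumptions introduced here for illustration only; they are not taken from the cited works.

```python
def kalman_step(mean, var, y, a, q, c, r):
    """One predict/correct step of a scalar Kalman filter (illustrative model).

    State:       X_next = a * X + N(0, q)
    Observation: Y      = c * X_next + N(0, r)
    The conditional density stays Gaussian, so (mean, var) is all we store.
    """
    # Prediction: push the Gaussian through the linear state equation.
    mean_pred = a * mean
    var_pred = a * a * var + q
    # Correction: Bayes' formula for Gaussians reduces to the gain formula.
    gain = var_pred * c / (c * c * var_pred + r)
    mean_new = mean_pred + gain * (y - c * mean_pred)
    var_new = (1.0 - gain * c) * var_pred
    return mean_new, var_new
```

Only two numbers are carried from one step to the next, which is exactly the finite-dimensionality that fails for general nonlinear problems.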

In the present paper we investigate three possible roles of filtering theory in mathemat- 
ical finance. 

The first problem concerns the stochastic volatility models. In recent applications, time 
varying volatility of financial time series has been modelled according to the stochastic 
volatility model, where the variance is considered to be a stochastic process representing 
an unobserved component. There are several reasons for which such a model represents 
a convenient choice: among them, the fact that such models are related to the type of 
diffusion processes one encounters in finance (asset pricing theory, see Melino and Turnbull 
(1990) [23]). Once the type of model is chosen, there are two problems to be solved: 

i) estimate the model parameters on the basis of the observed bilateral exchange rates; 

ii) estimate the volatility time series on the basis of the observed bilateral exchange 
rates. 

We develop point ii) by suggesting a different approach based on the projection filter 
of Brigo, Hanzon and Le Gland (1995) [7], (1997) [8]. 

We continue by considering as a second problem the state space approach of Geyer
and Pichler (1996) [15]. Such an approach is used to estimate and test multi-factor
Cox-Ingersoll-Ross (CIR) models of the term structure of interest rates. We concentrate on the
estimation procedure. We report the quasi-maximum-likelihood approach combined with a
Kalman filter as suggested by Geyer and Pichler, and we also hint at a possible completely
Bayesian approach which is sometimes used in system identification.

This state-space approach is convenient for several reasons. The model is estimated, as
in the classical cross-section approach, from observations of yields. However, in the
state-space approach yields are modelled with an additional noise term. In this way, market
imperfections and deviations from the true model are taken into account. Other advantages
are listed in the section of the paper devoted to this approach, and are presented in greater
detail in Geyer and Pichler (1996) [15].

The third problem presented concerns risk-minimizing hedging strategies under partial
observation in pricing financial derivatives, and is taken from Fischer, Platen and
Runggaldier (1996) [13]. This result is reported and commented on concisely, since
it has been thoroughly developed by the authors. It is an excellent example of how filtering
theory can fit nicely into the mathematical-finance setup, and such examples are rare in the
literature.

2 On estimating the stochastic volatility from observed 
bilateral exchange rate news 

2.1 Introduction 

The main problem econometricians face when dealing with a stochastic volatility model is 
the intractability of the likelihood function. In fact, the function turns out to involve a 
multiple integration, due to the unobserved stochastic variance. One can try to remedy this 
situation by using a quasi maximum likelihood (QML) method. Another possible remedy 
is the method of moments estimation (MME). Unfortunately, it has become clear that neither
method is always reliable (see Jacquier, Polson and Rossi (1994) [18] and Andersen
(1994) [2]). In Mahieu and Schotman (1997) [22] a study of several possible estimation
techniques is presented, and once the model has been estimated a Kalman smoother is 
applied to estimate the volatility time series. In order to do this, the model is transformed 
into a linear one and approximations are made to express the new additive noise, whose 
exact distribution is a log chi-squared. Some possibilities include the approximation of
such new noise by a Gaussian of mean $-1.27$ and variance $\pi^2/2$ (QML). Another possible
choice is to approximate the new noise via a mixture of Gaussian densities which
should approximate the log chi-squared distribution and other possible noise-distributions 
in a rather satisfactory way. In Mahieu and Schotman (1997) [22] an application of all 
the mentioned techniques to financial data is considered, and conclusions are drawn. In 
the following we suggest a different possible approach to the estimation of the volatility 
time series from observed bilateral exchange rates. Once the model has been estimated, 
instead of transforming the original (nonlinear) stochastic volatility model into a linear one 
and approximating the log chi-squared noise, we keep the original nonlinear system with 
Gaussian white noise and we propose to adopt nonlinear filtering techniques in order to 
estimate the volatility. The nonlinear filters we use are the projection filters, which were
defined and investigated in continuous time in Hanzon (1987) [16], Hanzon and Hut (1991)
[17], Brigo (1995) [4], (1996) [5], [6], and Brigo, Hanzon and Le Gland (1995) [7], (1997)
[8]. In this paper we give a short derivation of the projection filter in discrete time, and
we apply the theory for discrete time projection filters to the stochastic volatility model.

In general, our method has the advantage of fully taking into account the nonlinear
nature of the model adopted. We do not transform the model, so that, once it has been
estimated, the only approximation involved in the estimation of the volatility time series
is in the filtering technique adopted. In the near future, we plan to analyze the quality of
such an approximation by means of auxiliary quantities associated with the projection filter.

2.2 Finite dimensional approximation via minimization of the
Kullback-Leibler information

In this section we introduce briefly the Kullback-Leibler information and we explain its 
importance for our problem. Suppose we are given the space H of all the densities of 
probability measures on the real line equipped with its Borel field, which are absolutely 
continuous w.r.t. the Lebesgue measure. Then define 

$$D(p_1, p_2) := E_{p_1}\{\log p_1 - \log p_2\} \ge 0, \quad p_1, p_2 \in H, \qquad (1)$$

where in general

$$E_p\{\phi\} = \int \phi(x)\, p(x)\, dx, \quad p \in H.$$

The above quantity is the well-known Kullback-Leibler information (KLI). Its non-negativity
follows from the Jensen inequality. It gives a measure of how much the density $p_2$ is
displaced w.r.t. the density $p_1$. We remark the important fact that $D$ is not a distance: in
order to be a metric, it should be symmetric and satisfy the triangle inequality, which
is not the case. However, the KLI features many properties of a distance in a generalized
geometric setting (see for instance Amari (1985) [1]). For example, it is well-known that
the KLI is infinitesimally equivalent to the Fisher information metric around every point of
a finite-dimensional manifold of densities such as $EM(c)$ defined below. Consider a finite
dimensional manifold of exponential probability densities such as

$$EM(c) = \{p(\cdot, \theta) : \theta \in \Theta \subset \mathbb{R}^m\}, \quad \Theta \text{ open in } \mathbb{R}^m, \qquad (2)$$
$$p(\cdot, \theta) = \exp[\theta_1 c_1(\cdot) + \ldots + \theta_m c_m(\cdot) - \psi(\theta)],$$

expressed w.r.t. the expectation parameters $\eta$ defined by

$$\eta_i(\theta) = E_{p(\cdot,\theta)}\{c_i\} = \partial_{\theta_i} \psi(\theta), \quad i = 1, \ldots, m \qquad (3)$$

(see for example Brigo, Hanzon and Le Gland (1997) [7] for more details). We define
$p(x; \eta(\theta)) := p(x, \theta)$ (the semicolon identifies the parameterization). Now suppose we
are given a density $p \in H$, and we want to approximate it by a density of the finite
dimensional manifold $EM(c)$. It seems then reasonable to find a density $p(\cdot, \theta)$ in $EM(c)$
which minimizes the Kullback-Leibler information $D(p, \cdot)$. Compute

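$$\min_\theta D(p, p(\cdot, \theta)) = \min_\theta \{E_p[\log p - \log p(\cdot, \theta)]\}$$
$$= E_p \log p - \max_\theta \{\theta_1 E_p c_1 + \ldots + \theta_m E_p c_m - \psi(\theta)\}$$
$$= E_p \log p - \max_\theta V(\theta),$$
$$V(\theta) := \theta_1 E_p c_1 + \ldots + \theta_m E_p c_m - \psi(\theta).$$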

It follows immediately that a necessary condition for the minimum to be attained at $\theta^*$ is
$\partial_{\theta_i} V(\theta^*) = 0$, $i = 1, \ldots, m$, which yields

$$E_p c_i - \partial_{\theta_i} \psi(\theta^*) = E_p c_i - E_{p(\cdot,\theta^*)} c_i = 0, \quad i = 1, \ldots, m,$$

i.e. $E_p c_i = \eta_i(\theta^*)$, $i = 1, \ldots, m$. This last result indicates that according to the
Kullback-Leibler information, the best approximation of $p$ in the manifold $EM(c)$ is given by the
density of $EM(c)$ which shares the same $c_i$ expectations ($c_i$-moments) as the given density
$p$. This means that in order to approximate $p$ we only need its $c_i$ moments, $i = 1, 2, \ldots, m$.

One can look at the problem from the opposite point of view. Suppose we decide to
approximate the density $p$ by taking into account only its $m$ $c_i$-moments. It can be proved
(see Kagan, Linnik, and Rao (1973) [20], Theorem 13.2.1) that the maximum entropy
distribution which shares the $c$-moments with the given $p$ belongs to the family $EM(c)$.

Summarizing: If we decide to approximate by using c-moments, then entropy analysis 
supplies arguments to use the family EM(c); and if we decide to use the approximating 
family EM(c), Kullback-Leibler says that the "closest" approximating density in EM(c) 
shares the c-moments with the given density. 
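
The moment-matching property can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original derivation: it takes $c = (x, x^2)$, so that $EM(c)$ is the Gaussian family, uses a two-component Gaussian mixture as a hypothetical target density $p$, and approximates all integrals by Riemann sums on a grid. The names `log_gauss`, `kl`, `mean_star` and `var_star` are introduced here for illustration.

```python
import numpy as np

x = np.linspace(-12.0, 12.0, 4801)
dx = x[1] - x[0]

def log_gauss(x, mean, var):
    """Log-density of N(mean, var), an element of EM(c) with c = (x, x^2)."""
    return -0.5 * np.log(2.0 * np.pi * var) - (x - mean) ** 2 / (2.0 * var)

def kl(p, log_q):
    """D(p, q) = E_p[log p - log q], approximated by a Riemann sum."""
    return float(np.sum(p * (np.log(p) - log_q)) * dx)

# Hypothetical target density p: a mixture, which is not in the Gaussian family.
p = 0.6 * np.exp(log_gauss(x, -1.0, 0.5)) + 0.4 * np.exp(log_gauss(x, 2.0, 1.0))

# c-moments of p (first and second moment).
eta1 = float(np.sum(x * p) * dx)
eta2 = float(np.sum(x ** 2 * p) * dx)

# The Gaussian sharing these moments is the candidate KLI minimizer.
mean_star, var_star = eta1, eta2 - eta1 ** 2
kl_star = kl(p, log_gauss(x, mean_star, var_star))

# Perturbing either parameter should only increase D(p, .).
kl_shifted = kl(p, log_gauss(x, mean_star + 0.3, var_star))
kl_scaled = kl(p, log_gauss(x, mean_star, 1.3 * var_star))
print(kl_star, kl_shifted, kl_scaled)
```

In this sketch the matched Gaussian gives the smallest of the three printed values, in agreement with the necessary condition $E_p c_i = \eta_i(\theta^*)$.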

2.3 The stochastic volatility model 

Let $\{S_t, t \in T\}$, $T = \{0, 1, 2, 3, \ldots\}$ be a stochastic sequence describing bilateral
exchange rates in time, and define $Y_t := \log S_{t+1} - \log S_t$, $t \in T$. Assuming that the
change $Y_t$ of $\log S_t$ is unpredictable, the standard stochastic (logarithmic autoregressive)
volatility model (SVM) is given by

$$X_{t+1} = \rho X_t + \sigma W_{t+1}, \qquad (4)$$
$$Y_t = \exp\left(\frac{X_t + \gamma}{2}\right) V_t,$$

where $\{W_s, s \in T\}$ and $\{V_s, s \in T\}$ are independent standard Gaussian white noise
processes and $\rho, \sigma, \gamma$ are real constants. Usually the initial condition $X_0$ has a
noninformative density $p_{X_0}$. In such models the exchange rate features a fat tailed distribution
due to the mixing of $V_t$ and $\exp[(X_t + \gamma)/2]$. Consider the following nonlinear filtering
problem:
Estimate the stochastic volatility time series $\exp[(X_t + \gamma)/2]$ at time $t$ from the
following observations

$$Y_0^t := \{Y_s, s \in T, s \le t\} \qquad (5)$$

of the changes in the logarithms of the bilateral exchange rates up to time $t$.

The general solution of such a problem consists of the conditional probability density
$p_{X_t|Y_0^t}$, whose knowledge allows one to compute, among other estimates, the minimum mean
square error estimate $E\{\exp[(X_t + \gamma)/2] \mid Y_0^t\}$ of the stochastic volatility. Such
conditional densities obey the following Bayes formula:

$$p_{X_{t+1}|Y_0^{t+1}}(x) = \frac{p_{Y_{t+1}|X_{t+1}}(Y_{t+1}; x) \int_{-\infty}^{+\infty} p_{X_{t+1}|X_t}(x; u)\, p_{X_t|Y_0^t}(u)\, du}{p_{Y_{t+1}|Y_0^t}(Y_{t+1})}, \qquad (6)$$

$$p_{Y_{t+1}|Y_0^t}(y) = \int_{-\infty}^{+\infty} p_{Y_{t+1}|X_{t+1}}(y; \xi) \int_{-\infty}^{+\infty} p_{X_{t+1}|X_t}(\xi; u)\, p_{X_t|Y_0^t}(u)\, du\, d\xi. \qquad (7)$$

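From the structure of the processes $X_t$ and $Y_t$ and from the assumptions on the noises
$V_t$ and $W_t$ it follows immediately that $p_{Y_t|X_t}(y; x) = p_{\mathcal{N}(0, \exp(x+\gamma))}(y)$
and $p_{X_{t+1}|X_t}(x; u) = p_{\mathcal{N}(x, \sigma^2)}(\rho u)$. Bayes' formula now reads

$$p_{X_{t+1}|Y_0^{t+1}}(x) = \frac{p_{\mathcal{N}(0, \exp(x+\gamma))}(Y_{t+1}) \int_{-\infty}^{+\infty} p_{\mathcal{N}(x, \sigma^2)}(\rho u)\, p_{X_t|Y_0^t}(u)\, du}{p_{Y_{t+1}|Y_0^t}(Y_{t+1})}. \qquad (8)$$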

This is the exact solution of our filtering problem. However, it is very difficult to compute.
Assume for example that we can deal with the numerical integration involved above. The
problem is that in order to obtain the density at time $t + 1$, given the density at time $t$,
one has to update the given density point by point on the whole real line. In the next
section we suggest a finite dimensional filter which approximates the exact filter found in
this section.
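
The point-by-point update can be illustrated by a brute-force discretization of the Bayes recursion. This is only a sketch: the truncation of the real line to a finite grid, the Riemann-sum quadrature, the parameter values and the names `normal_pdf` and `bayes_step` are all assumptions made here for illustration.

```python
import numpy as np

def normal_pdf(z, mean, var):
    """Density of N(mean, var) evaluated at z (broadcasts over arrays)."""
    return np.exp(-(z - mean) ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

def bayes_step(grid, prior, y, rho, sigma, gamma):
    """One step of the exact recursion, with every density stored on a grid.

    prior : values of the density of X_t given observations up to t
    y     : new observation Y_{t+1}
    Returns the values of the density of X_{t+1} given observations up to t+1.
    """
    dx = grid[1] - grid[0]
    # Transition density of X_{t+1} = x given X_t = u: rows index x, columns u.
    kernel = normal_pdf(grid[:, None], rho * grid[None, :], sigma ** 2)
    predicted = kernel @ prior * dx                 # inner integral over u
    likelihood = normal_pdf(y, 0.0, np.exp(grid + gamma))
    posterior = likelihood * predicted
    return posterior / (posterior.sum() * dx)       # normalization

# Illustrative run with made-up parameter values and observations.
grid = np.linspace(-8.0, 8.0, 401)
density = normal_pdf(grid, 0.0, 1.0)
for y in [0.3, -1.2, 0.8]:
    density = bayes_step(grid, density, y, rho=0.95, sigma=0.3, gamma=0.0)
    dx = grid[1] - grid[0]
    vol_estimate = np.sum(np.exp((grid + 0.0) / 2.0) * density) * dx
    print(vol_estimate)
```

Each step stores and updates one number per grid point (here 401 of them), which is the practical cost that a finite dimensional filter tries to avoid.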



2.4 A projection filter for the stochastic volatility model 

Consider now the family $EM(c)$ of exponential densities defined in Section 2.2. More
specifically, we take the exponential manifold
$EP(m) := \{p(\cdot, \theta) : \theta \in \Theta \subset \mathbb{R}^m\}$, with $m$
an even positive integer and with a linear combination of the monomials $x, x^2, \ldots, x^m$ in
the exponent:

$$p(x, \theta) = \exp\{\theta_1 x + \ldots + \theta_m x^m - \psi(\theta)\}, \quad \theta_m < 0. \qquad (9)$$

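In Section 2.2 we showed that in order to approximate the density $p = p_{X_t|Y_0^t}$ with a
density $p(\cdot, \theta)$ of $EM(c)$, it suffices to find the density in $EM(c)$ such that the
$c_i$-expectations of $p$ and $p(\cdot, \theta)$ match. With our specific manifold $EM(c) = EP(m)$,
these expectations are exactly the first $m$ moments of the exponential density. Then, in computing
the projection filter, we update only the first $m$ moments. Suppose we have computed the projection
filter at time $t$ via the expectation parameters $\eta_1(t), \ldots, \eta_m(t)$. Writing $y$ for
the observed value of $Y_{t+1}$, Bayes' formula yields

$$\eta_i(t+1) = \frac{\int_{-\infty}^{+\infty} x^i\, p_{\mathcal{N}(0, \exp(x+\gamma))}(y) \int_{-\infty}^{+\infty} p_{\mathcal{N}(x, \sigma^2)}(\rho u)\, p(u; \eta(t))\, du\, dx}{\int_{-\infty}^{+\infty} p_{\mathcal{N}(0, \exp(x+\gamma))}(y) \int_{-\infty}^{+\infty} p_{\mathcal{N}(x, \sigma^2)}(\rho u)\, p(u; \eta(t))\, du\, dx}, \quad i = 1, \ldots, m,$$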
which allows one to update the expectation parameters. Then the new density $p(\cdot; \eta(t+1))$
may be computed recursively from the previous one $p(\cdot; \eta(t))$. If one prefers to avoid
normalization at every step, one can use the scheme

$$\alpha_j(t+1) = \int_{-\infty}^{+\infty} x^j\, p_{\mathcal{N}(0, \exp(x+\gamma))}(y) \int_{-\infty}^{+\infty} p_{\mathcal{N}(x, \sigma^2)}(\rho u)\, q(u; \alpha(t))\, du\, dx, \quad j = 0, \ldots, m,$$
$$\eta_i = \alpha_i / \alpha_0, \quad i = 1, \ldots, m, \qquad (10)$$



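where $q(\cdot; \alpha)$ is the unnormalized exponential density of the family
$\{\exp(\theta_0 + \theta_1 x + \ldots + \theta_m x^m);\ \theta_m < 0\}$, characterized by the
unnormalized expectation parameters $\alpha_i$, $i = 0, 1, \ldots, m$. Initially, at $t = 0$, one can
take $\alpha_0(0) = 1$, $\alpha_i(0) = \eta_i(0)$, $i = 1, \ldots, m$. By expanding this last
expression one obtains

$$\alpha_j(t+1) = \frac{1}{2\pi\sigma} \int_{-\infty}^{+\infty} \left\{ x^j \exp\left[-\frac{x+\gamma}{2} - \frac{x^2}{2\sigma^2} - \frac{1}{2} y^2 e^{-(x+\gamma)}\right] \int_{-\infty}^{+\infty} \exp\left[-\frac{1}{2\sigma^2}(-2\rho x u + \rho^2 u^2)\right] q(u; \alpha(t))\, du \right\} dx, \quad j = 0, \ldots, m. \qquad (12)$$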



This last equation yields the evolution of the $m + 1$ parameters $\alpha$ characterizing the
projection filter for $EP(m)$. However, there are some problems in implementing this equation.
Mainly, we need a way to express the exponential density $p(\cdot; \eta)$ explicitly from the
knowledge of the $\eta$. Actually, from the theory of exponential families (see Brigo (1996) [6],
Chapter 3 and references given therein) we know that the expectation parameters $\eta$
characterize the densities of $EP(m)$, but we do not know a direct way to express the densities
on the basis of such parameters. On the contrary, from (9) it is clear that the canonical
parameters $\theta$ allow one to express the densities of $EP(m)$ explicitly. In Brigo (1996) [6]
(Lemma 3.3.3) we give a recursive formula for $EP(m)$ which allows one to compute the
last expectation parameter $\eta_m$ and the higher order moments
$\eta_{m+i} = E_{p(\cdot,\theta)}\{x^{m+i}\}$ for all nonnegative integers $i$, on the basis of the
canonical parameters and of the first $m - 1$ expectation parameters $\eta_1, \ldots, \eta_{m-1}$.
Define the matrix $M(\eta)$ as follows:

$$M_{i,j}(\eta) := \eta_{i+j}, \quad i, j = 1, 2, \ldots, m. \qquad (13)$$

It is easy to verify that Lemma 3.3.3 of Brigo (1996) [6] implies the following formula:

$$\begin{pmatrix} \theta_1 \\ 2\theta_2 \\ \vdots \\ m\theta_m \end{pmatrix} = -M(\eta)^{-1} \begin{pmatrix} 2\eta_1 \\ 3\eta_2 \\ \vdots \\ (m+1)\eta_m \end{pmatrix}. \qquad (14)$$



From this last equation it follows that we can recover algebraically the canonical parameters
from the knowledge of the moments $\eta_1, \ldots, \eta_{2m}$ up to order $2m$. Then we can compute
the projection filter according to the following scheme:
(i) Given the initial density $p(x, \theta(0)) = p_{X_0}(x)$, set $t = 0$.

(ii) Assign $t := t + 1$.

(iii) Compute the first $m$ moments of the new projection filter density at time $t$ via the
formula (with $y$ the observed value of $Y_t$)

$$\alpha_j(t) = \frac{1}{2\pi\sigma} \int_{-\infty}^{+\infty} \left\{ x^j \exp\left[-\frac{x+\gamma}{2} - \frac{x^2}{2\sigma^2} - \frac{1}{2} y^2 e^{-(x+\gamma)}\right] \int_{-\infty}^{+\infty} \exp\left[-\frac{1}{2\sigma^2}(-2\rho x u + \rho^2 u^2)\right] p(u; \theta(t-1))\, du \right\} dx, \quad j = 0, \ldots, m,$$

$$\eta_i(t) = \alpha_i(t) / \alpha_0(t), \quad i = 1, \ldots, m.$$

(iv) Recover the canonical parameters $\theta(t)$ from the moments $\eta_1(t), \ldots, \eta_m(t)$
(the best way of doing this is still under investigation).

(v) Estimate the stochastic volatility by evaluating numerically the integral

$$E_{p(\cdot,\theta(t))}\left\{\exp\left(\frac{X + \gamma}{2}\right)\right\} = \int_{-\infty}^{+\infty} \exp\left(\frac{x + \gamma}{2}\right) p(x, \theta(t))\, dx.$$

(vi) Start again from (ii).

A possible problem in applying the above scheme is that for the integrals appearing in 
(iii) and (v) there are apparently no closed form expressions while the numerical integration 
is a subtle problem in this case. One of the difficulties in the numerical evaluation of the 
above integrals is that if the filter performs very well then the resulting density becomes 
very peaked, so that special numerical integration techniques are required. This problem 
is currently under investigation. 

A possible heuristic answer to the problem under investigation in point (iv) is to replace 
points (iii) and (iv) by the following: 

(iii.a) Compute the first $2m$ moments of the new projection filter density at time $t$ ($j$ and
$i$ range now up to $2m$).

(iv.a) Recover the canonical parameters $\theta(t)$ from the moments
$\eta_1(t), \ldots, \eta_{2m}(t)$ by using (14).

For a study of the behaviour of such a heuristic procedure, in a slightly different context,
and for a comparison to several alternatives, including a Newton method, see Borwein and
Huang (1995) [3]. Further investigations into this so-called polynomial moment problem
are called for. Better insight into the geometry of the manifolds $EP(m)$ is likely to be
helpful, especially to understand the behaviour of the various algorithms at the boundary
of the manifold where $\theta_m$ is close to zero.
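
Step (iv.a) amounts to one linear solve. The sketch below implements formula (14) as reconstructed from the integration-by-parts identity $(i+1)\eta_i + \sum_{j=1}^m j\theta_j \eta_{i+j} = 0$, $i = 1, \ldots, m$; the function name `canonical_from_moments` is introduced here for illustration, and the conditioning of $M(\eta)$ near the boundary of the manifold is not addressed.

```python
import numpy as np

def canonical_from_moments(eta):
    """Recover theta_1..theta_m of a density in EP(m) from eta_1..eta_2m.

    eta[0] holds eta_1, eta[1] holds eta_2, and so on (0-based storage of
    1-based moments), so the input must have even length 2m.
    """
    eta = np.asarray(eta, dtype=float)
    m = eta.size // 2
    # M[i, j] = eta_{i+j} for 1-based i, j = 1..m, as in (13).
    M = np.array([[eta[i + j - 1] for j in range(1, m + 1)]
                  for i in range(1, m + 1)])
    # Right-hand side of (14): -(2 eta_1, 3 eta_2, ..., (m+1) eta_m).
    rhs = -np.array([(i + 1) * eta[i - 1] for i in range(1, m + 1)])
    z = np.linalg.solve(M, rhs)            # z_i = i * theta_i
    return z / np.arange(1, m + 1)

# Check on a Gaussian (m = 2) with mean 0.5 and variance 2, for which
# theta_1 = mean / var = 0.25 and theta_2 = -1 / (2 var) = -0.25.
mu, v = 0.5, 2.0
moments = [mu, mu**2 + v, mu**3 + 3*mu*v, mu**4 + 6*mu**2*v + 3*v**2]
print(canonical_from_moments(moments))
```

For $m = 2$ the family $EP(2)$ is the Gaussian family, which gives a convenient closed-form check of the linear system.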

Concerning the scheme as a whole, difficulties in numerical integration in the various 
steps are still present. A good performance of the above scheme is not guaranteed and it 
should be tested on simulations. We hope to return to this matter in future research work. 
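
As a starting point for such simulations, one pass through steps (iii) to (v) can be sketched in the simplest case $m = 2$, where $EP(2)$ is the Gaussian family. This is a simplified illustration under assumptions of our own: the inner integral is evaluated in closed form (possible only because the previous density is Gaussian), the outer integrals are Riemann sums on a truncated grid, and the function name and grid settings are hypothetical.

```python
import numpy as np

def gaussian_projection_step(mean, var, y, rho, sigma, gamma,
                             n_grid=2001, width=10.0):
    """One step of the projection filter on EP(2) (Gaussian densities).

    (mean, var) describe the projected density at time t-1, y is the new
    observation. Returns the new mean and variance, the canonical parameters
    (theta_1, theta_2) and the volatility estimate of step (v).
    """
    # Inner integral: a Gaussian pushed through X' = rho X + sigma W.
    m_pred = rho * mean
    v_pred = rho ** 2 * var + sigma ** 2
    s = np.sqrt(v_pred)
    x = np.linspace(m_pred - width * s, m_pred + width * s, n_grid)
    dx = x[1] - x[0]
    pred = np.exp(-(x - m_pred) ** 2 / (2.0 * v_pred)) / np.sqrt(2.0 * np.pi * v_pred)
    # Likelihood of y under N(0, exp(x + gamma)), up to a constant factor.
    lik = np.exp(-0.5 * (x + gamma) - 0.5 * y ** 2 * np.exp(-(x + gamma)))
    w = lik * pred
    # Unnormalized moments alpha_0, alpha_1, alpha_2, then normalization.
    alpha = [np.sum(x ** j * w) * dx for j in range(3)]
    eta1, eta2 = alpha[1] / alpha[0], alpha[2] / alpha[0]
    new_mean, new_var = eta1, eta2 - eta1 ** 2
    # Canonical parameters of the Gaussian exp(theta_1 x + theta_2 x^2 - psi).
    theta = (new_mean / new_var, -0.5 / new_var)
    # E[exp((X + gamma) / 2)] for X ~ N(new_mean, new_var), in closed form.
    vol = np.exp(0.5 * (new_mean + gamma) + new_var / 8.0)
    return new_mean, new_var, theta, vol

print(gaussian_projection_step(0.0, 1.0, 0.5, rho=0.9, sigma=0.5, gamma=0.0))
```

For larger even $m$ the closed forms used here are no longer available, and the numerical integration issues discussed above have to be faced directly.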
3 A state space approach to estimate CIR models of
the term structure of interest rates 



We consider one of the most popular models of the term structure of interest rates: the
multi-factor Cox-Ingersoll-Ross (CIR) model. In this model one assumes the instantaneous
spot interest rate $r$ to be the sum of $K$ factors $X^j$, each following a square-root process
under the objective probability measure $P$:

$$r_t = X_t^1 + \ldots + X_t^K, \quad dX_t^j = k_j(\theta_j - X_t^j)\,dt + \sigma_j \sqrt{X_t^j}\, dW_t^j, \quad j = 1, \ldots, K. \qquad (15)$$



Let $\{\mathcal{F}_t, t \ge 0\}$ be the filtration representing the information available through
time. With some reasonable requirements on the parameters $k$, $\theta$ and $\sigma$, this model
yields an almost surely positive spot rate $r_t$ for all $t > 0$. This is generally considered as
one of the main advantages of the CIR model. The term structure is expressed by specifying the price
$P_t(T)$ at any time $t$ for a bond which pays 1 at the maturity time $t + T$. In order to be able
to price such bonds and specify the term structure of interest rates, one needs to specify
the attitude towards risk. This is done by specifying the so-called equivalent martingale
measure $Q$ or risk neutral measure. For simplicity, this measure is taken of a form such
that under $Q$ the factors $X$ still follow a square root process of the CIR type:



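$$\left.\frac{dQ}{dP}\right|_{\mathcal{F}_t} = \exp\left\{ \sum_{j=1}^{K} \left[ -\frac{\lambda_j^2}{2\sigma_j^2} \int_0^t X_s^j\, ds - \frac{\lambda_j}{\sigma_j} \int_0^t \sqrt{X_s^j}\, dW_s^j \right] \right\}.$$

Under the risk-neutral measure $Q$ the factors follow the equation

$$dX_t^j = [k_j \theta_j - (k_j + \lambda_j) X_t^j]\,dt + \sigma_j \sqrt{X_t^j}\, d\tilde{W}_t^j, \quad j = 1, \ldots, K,$$

where $\tilde{W}$ is a standard Brownian motion under the risk-neutral measure $Q$. The attitude
towards risk can be tuned by the parameters $\lambda_1, \ldots, \lambda_K$, the so-called market
prices of risk. Set $\alpha = (\lambda, k, \theta, \sigma)$. Yields are given by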



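$$y_t(T, \alpha) = -\frac{\log P_t(T)}{T} = -\frac{1}{T} \sum_{j=1}^{K} \left[ \log \phi(\alpha_j, T) - \psi(\alpha_j, T)\, X_t^j \right],$$

$$\phi(\alpha_j, T) = \left[ \frac{2\sqrt{h}\, \exp\{(k_j + \lambda_j + \sqrt{h})T/2\}}{2\sqrt{h} + (k_j + \lambda_j + \sqrt{h})(\exp\{T\sqrt{h}\} - 1)} \right]^{2 k_j \theta_j / \sigma_j^2},$$

$$\psi(\alpha_j, T) = \frac{2(\exp\{T\sqrt{h}\} - 1)}{2\sqrt{h} + (k_j + \lambda_j + \sqrt{h})(\exp\{T\sqrt{h}\} - 1)},$$

$$h = (k_j + \lambda_j)^2 + 2\sigma_j^2,$$

which are affine functions of the factors $X$. This is a second advantage of the CIR model:
it yields an affine term structure.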
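
The yield formula is straightforward to evaluate. The following sketch assumes the standard CIR bond price coefficients with the exponent $2k_j\theta_j/\sigma_j^2$ on $\phi$; the function names `cir_phi_psi` and `cir_yield` and the parameter values in the usage lines are illustrative assumptions.

```python
import math

def cir_phi_psi(lam, k, theta, sigma, T):
    """Coefficients phi and psi of one CIR factor for time to maturity T."""
    sh = math.sqrt((k + lam) ** 2 + 2.0 * sigma ** 2)      # sqrt(h)
    denom = 2.0 * sh + (k + lam + sh) * (math.exp(T * sh) - 1.0)
    base = 2.0 * sh * math.exp((k + lam + sh) * T / 2.0) / denom
    phi = base ** (2.0 * k * theta / sigma ** 2)
    psi = 2.0 * (math.exp(T * sh) - 1.0) / denom
    return phi, psi

def cir_yield(factors, params, T):
    """Model yield y_t(T, alpha) for factor values X_t^1..X_t^K.

    params is a list of tuples (lam_j, k_j, theta_j, sigma_j), one per factor.
    """
    total = 0.0
    for x, (lam, k, theta, sigma) in zip(factors, params):
        phi, psi = cir_phi_psi(lam, k, theta, sigma, T)
        total += math.log(phi) - psi * x
    return -total / T

# Illustrative two-factor example with made-up parameter values.
params = [(-0.1, 0.5, 0.06, 0.1), (-0.05, 0.1, 0.02, 0.05)]
for T in (0.25, 1.0, 5.0, 10.0):
    print(T, cir_yield([0.04, 0.01], params, T))
```

As $T$ tends to zero the yield tends to the spot rate $X_t^1 + \ldots + X_t^K$, which provides a quick sanity check of an implementation.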
Once this type of model has been established, one is confronted with the task of
estimating the model parameters $\alpha = (\lambda, k, \theta, \sigma)$ on the basis of the
available information. This problem is usually treated in two ways, as explained in Geyer and
Pichler (1996) [15].

1) The cross section approach: One fits the quantities $y_t(T, \alpha)$ given above to observed
yields in different periods of time, finding in each period the parameter values for
which the model yields $y_t(T, \alpha)$ are closest to the actually observed yields in that
period. The main objections to this approach are that the parameter estimates in
general are not the same in different periods of time, and the fact that even if they
were the same, the real dynamics of the spot rate $r$ need not follow the CIR structure.

2) The time series approach: One fits the SDEs for the $X$'s (usually for only one factor)
to observable proxies of $X^j$ (e.g. prices of T-bills or money-market rates). This
approach raises the following objection: fitting to different proxies usually produces
different estimates for the same parameters, so as to be inconsistent with the
no-arbitrage conditions. Moreover, this approach does not use available information
coming from observed yields.

The following state space approach answers the above objections by using both the 
CIR dynamics and the observed yields' cross section without the above inconsistencies. 

The idea can be described as follows: assume that the observed yields differ from the
yields $y_t(T, \alpha)$ prescribed by the model by a white noise process whose variance
$\delta^2$ is a new parameter to be estimated. This noise process can be viewed as a tool for
taking into account market imperfections and deviations from the true model. Among the possible
advantages of the state-space approach (over the pure cross-section approach and the
time-series approach) stated by Geyer and Pichler (1996) [15] we recall the following:

• There is no need to rely on proxies for the factors X, contrary to the time-series 
approach; 

• It is possible to estimate the parameters themselves rather than non-invertible func- 
tions of them; 

• It is possible to estimate the factors X themselves, not only the parameters of the 
model; 

• Measurement errors are taken into account explicitly. 

Let us formalize the observation process as follows: $\tau_t$ is the vector of the $n_t$ maturities
at time $t$, $\epsilon$ is a discrete-time white noise process, and $Y$ is the process of observed
yields, where the capital letter is used to distinguish between actually observed yields $Y$ and
the yields $y$ of the CIR model.

$$\tau_t := [T_t^1, \ldots, T_t^{n_t}], \quad \chi(\alpha, T) = -\frac{1}{T} \sum_{j=1}^{K} \log \phi(\alpha_j, T), \quad \Psi(\alpha, T) = \left[ \frac{\psi(\alpha_1, T)}{T}, \ldots, \frac{\psi(\alpha_K, T)}{T} \right],$$

$$Y_t^i := y_t(T_t^i) + \delta_i \epsilon_t^i = \chi(\alpha, T_t^i) + \Psi(\alpha, T_t^i) X_t + \delta_i \epsilon_t^i, \quad i = 1, \ldots, n_t. \qquad (16)$$

In vector form the observation process reads
$Y_t = \chi(\alpha, \tau_t) + \Psi(\alpha, \tau_t) X_t + \mathrm{Diag}(\delta_1, \ldots, \delta_{n_t})\, \epsilon_t$,
where the dimension $n_t$ of the vector varies over time with the number of maturities.

Now there are essentially two main possibilities for introducing filtering theory in this 
setup. 

3.1 Completely Bayesian approach 

The first approach is completely Bayesian, and is used in system identification. It consists 
of viewing the parameters as new state variables in order to reduce the problem to a 
nonlinear filtering problem. Set 

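$$(X^{K+j}, X^{2K+j}, X^{3K+j}, X^{4K+j}, X^{5K+i}) := (k_j, \theta_j, \sigma_j, \lambda_j, \delta_i), \quad j = 1, \ldots, K, \quad i = 1, \ldots, n_t.$$

In such a way, the equations of the system (15), (16), including the new state variables, are:

$$dX_t^{K+r} = 0, \quad r = 1, \ldots, 4K + n_t,$$

$$dX_t^j = X_t^{K+j} (X_t^{2K+j} - X_t^j)\,dt + X_t^{3K+j} \sqrt{X_t^j}\, dW_t^j, \quad j = 1, \ldots, K,$$

$$Y_m^1 = -\frac{1}{T_m^1} \sum_{j=1}^{K} \left[ \log \phi(X^{4K+j}, X^{K+j}, X^{2K+j}, X^{3K+j}, T_m^1) - \psi(X^{4K+j}, X^{K+j}, X^{2K+j}, X^{3K+j}, T_m^1)\, X_m^j \right] + X^{5K+1} \epsilon_m^1,$$

$$\vdots$$

$$Y_m^{n_m} = -\frac{1}{T_m^{n_m}} \sum_{j=1}^{K} \left[ \log \phi(X^{4K+j}, X^{K+j}, X^{2K+j}, X^{3K+j}, T_m^{n_m}) - \psi(X^{4K+j}, X^{K+j}, X^{2K+j}, X^{3K+j}, T_m^{n_m})\, X_m^j \right] + X^{5K+n_m} \epsilon_m^{n_m}.$$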

This is a filtering problem with continuous time state $X$ and discrete time observations
$Y$, as described for example in Jazwinski (1970) [19]. Indeed, the unobserved signal is $X$,
and the observation process $Y$ consists of a deterministic functional of $X$ plus some noise
$X\epsilon$. Notice that the noise is state dependent, since components of the state $X$ appear in
front of the white noise process $\epsilon$. The above filtering problem is nonlinear, and as such
is infinite dimensional. An approximation of its solution can be considered. For example, 
one can use the extended Kalman filter (see again Jazwinski (1970) [19]) even though no 
general analytical result on the quality of the filter estimates is available. Justifications of 
the use of this filter are usually based on heuristics. 
3.2 Quasi Maximum Likelihood

This method is based on an approximate computation of the likelihood function. Consider
equations (15) for the factors of the CIR model. One of the advantages of square root
processes like $X$ is that they yield closed formulas for the mean and the variance of the
factors themselves. This is somewhat helpful in establishing approximations, although
nonlinearities in (15) imply that mean and variance are not sufficient to characterize the
probabilistic behaviour of the factors $X$, contrary to the linear case. Indeed, the factor $X^j$
features a non-central $\chi^2$ transition density. Define
$\hat{X}^j_{t|s} = E\{X_t^j \mid Y_1, \ldots, Y_s\}$ and
$V^{j,j}_{t|s} = E\{(X_t^j - \hat{X}^j_{t|s})^2 \mid Y_1, \ldots, Y_s\}$ for $j = 1, \ldots, K$ and
for any $0 \le s \le t$. From the above considerations it follows easily that between two
observations, for $m < t < m + 1$, the prediction step is given by

$$\hat{X}^j_{m+1|m} = \theta_j [1 - \exp(-k_j)] + \exp(-k_j)\, \hat{X}^j_{m|m},$$

$$V^{j,j}_{m+1|m} = \sigma_j^2\, \frac{1 - \exp(-k_j)}{k_j} \left[ \frac{\theta_j}{2} (1 - \exp(-k_j)) + \exp(-k_j)\, \hat{X}^j_{m|m} \right] + \exp(-2k_j)\, V^{j,j}_{m|m}. \qquad (17)$$
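
The prediction step is a closed-form map of the filtered mean and variance. The sketch below codes (17) for a single factor and a unit time step between observations; the function name `cir_predict` is an assumption introduced for illustration.

```python
import math

def cir_predict(x_filt, v_filt, k, theta, sigma):
    """Prediction step (17) for one CIR factor over a unit time step.

    x_filt, v_filt : filtered mean and variance at time m
    Returns the predicted mean and variance at time m + 1.
    """
    e = math.exp(-k)
    x_pred = theta * (1.0 - e) + e * x_filt
    # Conditional variance of the square-root process plus the propagated
    # uncertainty about the current state.
    v_pred = (sigma ** 2 * (1.0 - e) / k
              * (0.5 * theta * (1.0 - e) + e * x_filt)
              + e * e * v_filt)
    return x_pred, v_pred

print(cir_predict(0.04, 1e-4, k=0.5, theta=0.06, sigma=0.1))
```

Starting from the long-run level $\theta_j$ with zero filtered variance, the predicted mean stays at $\theta_j$ and the predicted variance equals $\sigma_j^2 \theta_j (1 - e^{-2k_j})/(2k_j)$.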



Notice that even if at a certain time the conditional density $p^j_{m|m}$ of $X^j_m$ given
$Y_1, \ldots, Y_m$ were Gaussian, i.e.

$$p^j_{m|m} \sim \mathcal{N}(\hat{X}^j_{m|m}, V^{j,j}_{m|m}),$$

the prediction step would lead us out of the Gaussian family:

$$p^j_{m+1|m} \not\sim \mathcal{N}(\hat{X}^j_{m+1|m}, V^{j,j}_{m+1|m}).$$

Therefore, $p^j_{m+1|m}$ is not Gaussian and its mean $\hat{X}^j_{m+1|m}$ and variance
$V^{j,j}_{m+1|m}$ are not enough to activate the correction step (Bayes' formula) leading to the
conditional density $p^j_{m+1|m+1}$. In order to avoid such difficulties, one can replace the real
$p^j_{m+1|m}$ by $\mathcal{N}(\hat{X}^j_{m+1|m}, V^{j,j}_{m+1|m})$, i.e. replace the density
$p^j_{m+1|m}$ by a Gaussian density sharing its first two moments. This is actually what is done in
Geyer and Pichler [15]. As we remarked earlier in Section 2.2, this amounts to replacing
$p^j_{m+1|m}$ by its best approximation, in the Kullback-Leibler sense, within the Gaussian family.
Therefore the approximate filter used here can be interpreted as a Gaussian projection filter! By
this approximation, it follows that the approximated correction at $t = m + 1$, when $Y_{m+1}$ is
available, is given by Bayes' formula and can be summarized by

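$$\Delta_m := \mathrm{Diag}(\delta_1, \ldots, \delta_{n_m}),$$

$$\hat{X}_{m+1|m+1} = \left\{ \hat{X}_{m+1|m} + V_{m+1|m} \Psi^T(\alpha, \tau_{m+1}) \left[ \Psi(\alpha, \tau_{m+1}) V_{m+1|m} \Psi^T(\alpha, \tau_{m+1}) + \Delta^2_{m+1} \right]^{-1} \left( Y_{m+1} - \chi(\alpha, \tau_{m+1}) - \Psi(\alpha, \tau_{m+1}) \hat{X}_{m+1|m} \right) \right\}^+,$$

$$V_{m+1|m+1} = \left[ \Psi^T(\alpha, \tau_{m+1})\, \Delta^{-2}_{m+1}\, \Psi(\alpha, \tau_{m+1}) + V^{-1}_{m+1|m} \right]^{-1}. \qquad (18)$$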



The symbol $\{\cdot\}^+$ in the above equation denotes the positive part. It is applied in order
to make sure that the approximate conditional mean $\hat{X}$ is positive. We can now calculate
the quasi-likelihood function as follows: Set $\beta = (\alpha, \delta)$ and compute

$$p_{Y_1, \ldots, Y_n}(y_1, \ldots, y_n; \beta) = p_{Y_n - \hat{Y}_{n|n-1}}(y_n - \hat{Y}_{n|n-1}; \beta)\; p_{Y_{n-1} - \hat{Y}_{n-1|n-2}}(y_{n-1} - \hat{Y}_{n-1|n-2}; \beta) \cdots$$

$$= p_{\mathcal{N}(0,\, \Psi(\alpha,\tau_n) V_{n|n-1} \Psi(\alpha,\tau_n)^T + \Delta_n^2)}(y_n - \hat{Y}_{n|n-1})\; p_{\mathcal{N}(0,\, \Psi(\alpha,\tau_{n-1}) V_{n-1|n-2} \Psi(\alpha,\tau_{n-1})^T + \Delta_{n-1}^2)}(y_{n-1} - \hat{Y}_{n-1|n-2}) \cdots p_{\mathcal{N}(0,\, \Psi(\alpha,\tau_1) V_1 \Psi(\alpha,\tau_1)^T + \Delta_1^2)}(y_1 - \hat{Y}_1).$$

This function can be computed (and maximized) once we know $\hat{Y}$ and $V$ for all $\beta$. These
quantities can be obtained for every possible value of $\beta$ from the above recursion (17),
(18). Of course, in practice numerical simulation techniques are required to maximize the
quasi-likelihood.
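
The recursion (17), (18) and the resulting quasi-likelihood can be sketched for the one-factor case. The code below is a simplified illustration, not the implementation of Geyer and Pichler: it assumes $K = 1$, a fixed set of maturities, a common measurement standard deviation $\delta$ for all maturities, and a unit time step; the names `phi_psi` and `quasi_loglik` are hypothetical.

```python
import numpy as np

def phi_psi(lam, k, theta, sigma, T):
    """CIR coefficients phi and psi for an array of maturities T."""
    T = np.asarray(T, dtype=float)
    sh = np.sqrt((k + lam) ** 2 + 2.0 * sigma ** 2)
    denom = 2.0 * sh + (k + lam + sh) * (np.exp(T * sh) - 1.0)
    base = 2.0 * sh * np.exp((k + lam + sh) * T / 2.0) / denom
    return base ** (2.0 * k * theta / sigma ** 2), 2.0 * (np.exp(T * sh) - 1.0) / denom

def quasi_loglik(beta, Y, T, x0, v0):
    """Gaussian quasi log-likelihood for a one-factor CIR model.

    beta = (lam, k, theta, sigma, delta); Y has shape (n_obs, n_maturities);
    x0, v0 are the assumed initial filtered mean and variance.
    """
    lam, k, theta, sigma, delta = beta
    phi, psi = phi_psi(lam, k, theta, sigma, T)
    chi = -np.log(phi) / T            # intercepts chi(alpha, T_i)
    Psi = psi / T                     # loadings Psi(alpha, T_i)
    e = np.exp(-k)
    x, v, ll = x0, v0, 0.0
    for y in Y:
        # Prediction step (17); the variance uses the old filtered mean.
        v = sigma ** 2 * (1 - e) / k * (0.5 * theta * (1 - e) + e * x) + e * e * v
        x = theta * (1 - e) + e * x
        # Innovation and its Gaussian quasi-density.
        S = v * np.outer(Psi, Psi) + delta ** 2 * np.eye(len(Psi))
        innov = y - chi - Psi * x
        S_inv_innov = np.linalg.solve(S, innov)
        _, logdet = np.linalg.slogdet(S)
        ll += -0.5 * (len(Psi) * np.log(2 * np.pi) + logdet + innov @ S_inv_innov)
        # Correction step (18), with the positive part applied to the mean.
        x = max(x + v * (Psi @ S_inv_innov), 0.0)
        v = 1.0 / (Psi @ Psi / delta ** 2 + 1.0 / v)
    return ll
```

The returned value can be passed (with a sign change) to a generic numerical optimizer over $\beta$; the choice of optimizer and of starting values is left open here.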

The two unanswered questions about this approach are: 

• How good is the Kullback-Leibler projection on the Gaussian family used after the 
prediction step? 

• How good is taking $\{\cdot\}^+$ in the correction?

In order to deal appropriately with the first of these questions one can make use of the
concept of projection residual that was developed for the continuous time case in Brigo,
Hanzon and Le Gland (1995) [7]. This concept can actually be used here, because the
approximate filter used in [15] has in fact the interpretation of a continuous time Gaussian
Projection Filter for a continuous time signal observed in discrete time. Of course the
question about taking $\{\cdot\}^+$ arises because here one works with Gaussian densities. In
order to avoid this problem one could try to work with a class of densities which have their
support on the non-negative real half-line and work out the Projection Filter, for the model
under investigation here, by using such a class of densities.



4 Risk-minimizing hedging strategies under partial
observation

We briefly report the result of Fischer, Platen, and Runggaldier (1996) [13]. This is a
significant case where filtering theory fits nicely a mathematical-finance setup. A financial
market is considered over a time interval $[0, T]$ with a risky asset, whose price is denoted
by $S$, and a bond, whose price $B$ is assumed identically equal to one. Under a martingale
measure, we write

$$B_t = 1,$$
$$dS_t = \sigma_t(Z_t)\, S_t\, dW_t,$$
$$dy_t = A_t S_t\, dt + D_t\, dV_t.$$

Let $\mathcal{F}_t = \sigma\{S_u, Z_u : u \le t\}$ be the information represented by observation of
$S$ and $Z$ up to time $t$. The process $Z$ is a hidden Markov process (representing the state of
the economy) with transition intensity matrix $\Lambda$. Let $N_t$ be the number of jumps of $Z$
(number of changes in the economy) up to time $t$. The process $y_t$ represents observation of
$S_t$ in additive noise, reflecting the possibility that not all indicated prices are actually
traded. Our observation process is denoted by $Y_t := [y_t, N_t]$. Denote by
$\mathcal{Y}_t := \sigma\{Y_s, s \le t\}$ the information represented by observation of $Y$ up to
time $t$. We assume that $S_t$ is fully observed. We consider a contingent claim $H = H(S_T)$ to be
priced at all $t \le T$. We will consider two cases: full observations
$\{\mathcal{F}_t : t \ge 0\}$ available, and partial observations $\{\mathcal{Y}_t : t \ge 0\}$
available. In both cases we are dealing with an incomplete market, since there are more sources of
randomness than traded risky assets. Hence perfect hedging with self-financing portfolios is not
possible in general. We can still try to determine a mean self-financing hedging strategy that
minimizes a risk criterion related to the lack of self-financing.

We begin with the case of full observations. The main ingredient is the Kunita-Watanabe
decomposition. We are looking for a strategy $(\xi_t, \eta_t)$ ($\xi_t$ amount of stock, $\eta_t$
amount of bond) such that

i) $\xi_t$ is $\mathcal{F}_t$-predictable, $\eta_t$ is $\mathcal{F}_t$-adapted, and $\xi$ satisfies
a square-integrability condition;

ii) $\xi_T S_T + \eta_T \cdot 1 = H$ (final value of the strategy equals the claim);

iii) $\xi_t S_t + \eta_t \cdot 1 - \int_0^t \xi_u\, dS_u =: C_t(\xi, \eta)$ (value minus gains, which
is constant for a self-financing strategy) is a martingale (constant in mean);

iv) it minimizes $E\{(C_T - C_t)^2 \mid \mathcal{F}_t\}$ for each $t$ (quadratic criterion) among all
other strategies as in (i), (ii), (iii).

The solution of this problem was derived by Föllmer and Schweizer (1991) [14]. They
proved, among other results, that if $H \in L^2(\mathcal{F}_T, Q)$ ($Q$ is a martingale measure for
$S$), then the solution is expressed through the integrand $\xi^H$, where

$$E\{H \mid \mathcal{F}_t\} = EH + \int_0^t \xi^H_u\, dS_u + L^H_t$$

is the Kunita-Watanabe decomposition ($L^H$ is a martingale, orthogonal to $S$).

In the case of partial observations, points (i), (iii) and (iv) are replaced respectively by

i) $\xi_t$ is $\mathcal{Y}_t$-predictable, $\eta_t$ is $\mathcal{Y}_t$-adapted, and

$$E\left\{ \int_0^T |\xi_t|^2\, \sigma_t(Z_t)^2\, S_t^2\, dt \,\Big|\, \mathcal{Y} \right\} < \infty;$$

iii) $E\{\xi_t S_t + \eta_t \cdot 1 - \int_0^t \xi_u\, dS_u \mid \mathcal{Y}_t\} = E\{C_t(\xi, \eta) \mid \mathcal{Y}_t\}$
is a $(\mathcal{Y}_t, Q)$-martingale;

iv) it minimizes $E\{(C_T - C_t)^2 \mid \mathcal{Y}_t\}$ among all other strategies as in (i), (ii),
(iii).

The solution of this second problem was given by Schweizer (1994) [25], see also Di Masi,
Platen and Runggaldier (1995) [12]:

$$\xi^{\mathcal{Y}}_t = \frac{E\{\xi^H_t\, \sigma_t^2(Z_t)\, S_t^2 \mid \mathcal{Y}_t\}}{E\{\sigma_t^2(Z_t)\, S_t^2 \mid \mathcal{Y}_t\}}, \qquad \eta^{\mathcal{Y}}_t = E\{H \mid \mathcal{Y}_t\} - \xi^{\mathcal{Y}}_t S_t.$$
How can one compute $\xi^H$ and $E\{H \mid \mathcal{Y}_t\}$ explicitly? The solution of this problem
was given by Di Masi, Kabanov and Runggaldier (1994) [11]. If $H$ has polynomial growth, then

$$\xi^H_t = \xi^H(S_t, Z_t) = \frac{\partial}{\partial x} u_t(S_t, Z_t), \qquad E\{H \mid \mathcal{Y}_t\} = E\{u_t(S_t, Z_t) \mid \mathcal{Y}_t\},$$

where $u_t(x, i) = E\{H \mid S_t = x, Z_t = i\}$ solves

$$\partial_t u_t(x, i) + \frac{1}{2} \sigma_t^2(i)\, x^2\, \frac{\partial^2}{\partial x^2} u_t(x, i) + \sum_j \lambda_{ij}\, u_t(x, j) = 0, \qquad u_T(x, i) = H(x),$$

with $\lambda_{ij}$ the entries of the transition intensity matrix of $Z$.

The $\mathcal{Y}$-mean self-financing strategy can be computed via the conditional distribution
of the unobserved state $(S_t, Z_t)$ given the observations $\mathcal{Y}_t$. This is the filtering
problem treated by Miller and Runggaldier (1996) [24].